Abstract. In the vicinity of a solution of a nonlinear programming problem at which both
strict complementarity and linear independence of the active constraints may fail to hold, we
describe a technique for distinguishing weakly active from strongly active constraints. We show
that this information can be used to modify the sequential quadratic programming algorithm

1. Introduction

Consider the following nonlinear programming problem with inequality
constraints:

\[ \mathrm{NLP:} \quad \min_z \ \phi(z) \quad \text{subject to } g(z) \le 0, \tag{1} \]

where $\phi : R^n \to R$ and $g : R^n \to R^m$ are twice Lipschitz continuously
differentiable functions. Optimality conditions for (1) can be derived from the
Lagrangian for (1), which is

\[ \mathcal{L}(z,\lambda) = \phi(z) + \lambda^T g(z), \tag{2} \]

where $\lambda \in R^m$ is the vector of Lagrange multipliers. When a constraint
qualification holds at $z^*$ (see discussion below), the first-order necessary
conditions for $z^*$ to be a local solution of (1) are that there exists a vector
$\lambda^* \in R^m$ such that

\[ \mathcal{L}_z(z^*,\lambda^*) = 0, \quad g(z^*) \le 0, \quad \lambda^* \ge 0, \quad (\lambda^*)^T g(z^*) = 0. \tag{3} \]

These relations are the well-known Karush-Kuhn-Tucker (KKT) conditions. The
set $\mathcal{B}$ of active constraints at $z^*$ is

\[ \mathcal{B} = \{ i = 1,2,\dots,m \mid g_i(z^*) = 0 \}. \tag{4} \]

It follows immediately from (3) that we can have $\lambda_i^* > 0$ only if $i \in \mathcal{B}$. The weakly
active constraints are identified by the indices $i \in \mathcal{B}$ for which $\lambda_i^* = 0$ for all
$\lambda^*$ satisfying (3). Conversely, the strongly active constraints are those for which
$\lambda_i^* > 0$ for at least one multiplier $\lambda^*$ satisfying (3). The strict complementarity
condition holds at $z^*$ if there are no weakly active constraints.

We are interested in degenerate problems, those for which the active constraint
gradients at the solution are linearly dependent or the strict complementarity
condition fails to hold (or both). The first part of our paper describes a
technique for partitioning $\mathcal{B}$ into weakly active and strongly active indices.
Section 3 builds on the technique described by Facchinei, Fischer, and Kanzow
for identifying $\mathcal{B}$. Our technique requires the solution of a sequence of closely
related linear programming subproblems in which the set of strongly active
indices is assembled progressively. Solution of one additional linear program yields
a Lagrange multiplier estimate $\hat\lambda$ such that the components $\hat\lambda_i$ for all strongly
active indices $i$ are bounded below by a positive constant.

In the second part of the paper, we use this technique to adjust the Lagrange
multiplier estimate between iterations of the stabilized sequential quadratic
programming (sSQP) algorithm described by Wright [18] and Hager. The
resulting technique has the advantage that it converges superlinearly under weaker
conditions than considered in these earlier papers. We can drop the assumption
of strict complementarity and a "sufficiently interior" starting point made in
[18], and we do not need the stronger second-order conditions of Hager. Motivation
for the sSQP approach came from work on primal-dual interior-point algorithms
described in [19] and one further reference. It is also closely related to the method
of multipliers and the "recursive successive quadratic programming" approach of
Bartholomew-Biggs. (See Wright [16, Section 6] for a discussion of the similarities.)

Other work on stabilization of the SQP approach to yield superlinear convergence
under weakened conditions has been performed by Fischer and Wright [16].
Fischer proposed an algorithm in which an additional quadratic
program is solved between iterations of SQP in order to adjust the Lagrange
multiplier estimate. He proved superlinear convergence under conditions that
are weaker than the standard assumptions but stronger than the ones made in
this paper. Wright described superlinear local convergence properties of a class
of inexact SQP methods and showed that sSQP and Fischer's method could be
expressed as members of this class. That paper also introduced a modification of
standard SQP that enforced only a subset of the linearized constraints (those
in a "strictly active working set") and permitted slight violations of the
nonenforced constraints, yet achieved superlinear convergence under weaker-than-usual
conditions.

Bonnans showed that when strict complementarity fails to hold but the
active constraint gradients are linearly independent, the standard SQP
algorithm (in which any nonuniqueness in the solution of the SQP subproblem
is resolved by taking the solution of minimum norm) converges superlinearly.

Our concern here is with local behavior, so we assume availability of a starting
point $(z^0, \lambda^0)$ that is "sufficiently close" to the optimal primal-dual set. We
believe, however, that ingredients of the approach proposed here can be embedded
in practical algorithms, such as SQP algorithms that include modifications
(merit functions and filters) to ensure global convergence. We believe also that
this approach could be used to enhance the robustness and convergence rate of
other types of algorithms, including augmented Lagrangian and interior-point
algorithms, in problems in which there is degeneracy at the solution. We mention
one such extension in a later section.



2. Assumptions, Notation, and Basic Results 

We now review the optimality conditions for (1) and outline the assumptions
that are used in subsequent sections. These include the second-order sufficient
condition we use here, the Mangasarian-Fromovitz constraint qualification, and
the definition of weakly active indices.

Recall the KKT conditions (3). The set of "optimal" Lagrange multipliers $\lambda^*$
is denoted by $\mathcal{S}_\lambda$, and the primal-dual optimal set is denoted by $\mathcal{S}$. Specifically,
we have

\[ \mathcal{S}_\lambda = \{ \lambda^* \mid \lambda^* \text{ satisfies (3)} \}, \qquad \mathcal{S} = \{z^*\} \times \mathcal{S}_\lambda. \tag{5} \]

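An alternative, compact form of the KKT conditions is the following variational
inequality formulation:

\[ \begin{bmatrix} \nabla\phi(z^*) + \nabla g(z^*)\lambda^* \\ g(z^*) \end{bmatrix} \in \begin{bmatrix} 0 \\ N(\lambda^*) \end{bmatrix}, \tag{6} \]

where $N(\lambda)$ is the set defined by

\[ N(\lambda) \stackrel{\mathrm{def}}{=} \begin{cases} \{ y \mid y \le 0 \text{ and } y^T\lambda = 0 \} & \text{if } \lambda \ge 0, \\ \emptyset & \text{otherwise.} \end{cases} \tag{7} \]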

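We now introduce notation for subsets of the set $\mathcal{B}$ of active constraint indices
at $z^*$, defined in (4). For any optimal multiplier $\lambda^* \in \mathcal{S}_\lambda$, we define the set $\mathcal{B}_+(\lambda^*)$
to be the "support" of $\lambda^*$, that is,

\[ \mathcal{B}_+(\lambda^*) = \{ i \in \mathcal{B} \mid \lambda_i^* > 0 \}. \]

We define $\mathcal{B}_+$ (without argument) as

\[ \mathcal{B}_+ \stackrel{\mathrm{def}}{=} \bigcup_{\lambda^* \in \mathcal{S}_\lambda} \mathcal{B}_+(\lambda^*); \tag{8} \]

this set contains the indices of the strongly active constraints. Its complement in
$\mathcal{B}$ is denoted by $\mathcal{B}_0$, that is,

\[ \mathcal{B}_0 = \mathcal{B} \setminus \mathcal{B}_+. \]

This set $\mathcal{B}_0$ contains the weakly active constraint indices, those indices $i \in \mathcal{B}$
such that $\lambda_i^* = 0$ for all $\lambda^* \in \mathcal{S}_\lambda$. In later sections, we make use of the quantity
$\epsilon_\lambda$ defined by

\[ \epsilon_\lambda \stackrel{\mathrm{def}}{=} \max_{\lambda^* \in \mathcal{S}_\lambda} \ \min_{i \in \mathcal{B}_+} \lambda_i^*. \tag{9} \]

Note by the definition of $\mathcal{B}_+$ that $\epsilon_\lambda > 0$.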

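The Mangasarian-Fromovitz constraint qualification (MFCQ) holds at
$z^*$ if there is a vector $y \in R^n$ such that

\[ \nabla g_i(z^*)^T y < 0 \quad \text{for all } i \in \mathcal{B}. \]

By defining $\nabla g_{\mathcal{B}}$ to be the $n \times |\mathcal{B}|$ matrix whose columns are $\nabla g_i(\cdot)$, $i \in \mathcal{B}$, we can
write this condition alternatively as

\[ \nabla g_{\mathcal{B}}(z^*)^T y < 0. \tag{10} \]

It is well known that MFCQ is equivalent to boundedness of the set $\mathcal{S}_\lambda$; see
Gauvin.

Since $\mathcal{S}_\lambda$ is defined by the linear conditions $\nabla\phi(z^*) + \nabla g(z^*)\lambda^* = 0$ and $\lambda^* \ge 0$
(together with $\lambda_i^* = 0$ for $i \notin \mathcal{B}$), it is closed and convex. Therefore, under MFCQ,
it is also compact.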

We assume throughout that the following second-order condition is satisfied:
there is $\upsilon > 0$ such that

\[ w^T \mathcal{L}_{zz}(z^*,\lambda^*) w \ge \upsilon \|w\|^2, \quad \text{for all } \lambda^* \in \mathcal{S}_\lambda, \tag{11} \]

and for all $w$ such that

\[ \begin{aligned} \nabla g_i(z^*)^T w &= 0, \quad \text{for all } i \in \mathcal{B}_+, \\ \nabla g_i(z^*)^T w &\le 0, \quad \text{for all } i \in \mathcal{B}_0. \end{aligned} \tag{12} \]

This condition is referred to as Condition 2s.1 in [16, Section 3]. Weaker second-order
conditions, stated in terms of a quadratic growth condition of the objective
$\phi(z)$ in a feasible neighborhood of $z^*$, are discussed by Bonnans and Ioffe and by
Anitescu.

Our standing assumption for this paper is as follows. 

Assumption 1. The first-order conditions (3), the MFCQ (10), and the second-order
condition (11), (12) are satisfied at $z^*$. Moreover, the functions $\phi$ and $g$
are twice Lipschitz continuously differentiable in a neighborhood of $z^*$.

The following is an immediate consequence of this assumption.

Theorem 1. Suppose that Assumption 1 holds. Then $z^*$ is an isolated stationary
point and a strict local minimizer of (1).

Proof. See Robinson [13, Theorems 2.2 and 2.4].

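We use the notation $\delta(\cdot)$ to denote distances from the primal, dual, and
primal-dual optimal sets, according to context. Specifically, we define

\[ \delta(z) \stackrel{\mathrm{def}}{=} \|z - z^*\|, \quad \delta(\lambda) \stackrel{\mathrm{def}}{=} \mathrm{dist}(\lambda, \mathcal{S}_\lambda), \quad \delta(z,\lambda) \stackrel{\mathrm{def}}{=} \mathrm{dist}((z,\lambda), \mathcal{S}), \tag{13} \]

where $\|\cdot\|$ denotes the Euclidean norm unless a subscript specifically indicates
otherwise. We also use $P(\lambda)$ to denote the projection of $\lambda$ onto $\mathcal{S}_\lambda$; that is, we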



Constraint Identification for Degenerate Nonlinear Programs 






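have $P(\lambda) \in \mathcal{S}_\lambda$ and $\|P(\lambda) - \lambda\| = \mathrm{dist}(\lambda, \mathcal{S}_\lambda)$. Note that from (13) we have
$\delta(z)^2 + \delta(\lambda)^2 = \delta(z,\lambda)^2$, and therefore

\[ \delta(z) \le \delta(z,\lambda), \quad \delta(\lambda) \le \delta(z,\lambda). \tag{14} \]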

Using Assumption 1, we can prove the following result, which gives a practical
way to estimate the distance $\delta(z,\lambda)$ of $(z,\lambda)$ to the primal-dual solution set $\mathcal{S}$.

Theorem 2. Suppose that Assumption 1 holds. Then there are positive constants
$\bar\delta$, $\kappa_0$, and $\kappa_1$ such that for all $(z,\lambda)$ with $\delta(z,\lambda) \le \bar\delta$, the quantity $\eta(z,\lambda)$
defined by

\[ \eta(z,\lambda) = \left\| \begin{bmatrix} \mathcal{L}_z(z,\lambda) \\ \min(\lambda, -g(z)) \end{bmatrix} \right\| \tag{15} \]

(where $\min(\lambda, -g(z))$ denotes the vector whose $i$th component is $\min(\lambda_i, -g_i(z))$)
satisfies

\[ \kappa_0 \delta(z,\lambda) \le \eta(z,\lambda) \le \kappa_1 \delta(z,\lambda). \]

See Facchinei, Fischer, and Kanzow [Theorem 3.6], Wright [16, Theorem A.1],
and Hager and Gowda [Lemma 2] for proofs of this result. (The second-order
condition is stated in a slightly different fashion in the paper of Facchinei,
Fischer, and Kanzow but is equivalent to (11), (12).)
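
The residual in (15) is inexpensive to evaluate once the gradients are available. The following minimal sketch shows one way to compute it; the function name `kkt_residual` and the list-based data layout (with `jac_g[i]` holding $\nabla g_i(z)$) are assumptions made for illustration and are not part of the original text.

```python
import math


def kkt_residual(grad_phi, jac_g, g_val, lam):
    """Evaluate eta(z, lam) from (15).

    grad_phi : gradient of phi at z (length n)
    jac_g    : jac_g[i] is the gradient of g_i at z (m rows of length n)
    g_val    : constraint values g(z) (length m)
    lam      : multiplier estimate lambda (length m)
    """
    n, m = len(grad_phi), len(g_val)
    # Gradient of the Lagrangian: grad phi(z) + sum_i lam_i * grad g_i(z).
    lag_grad = [grad_phi[j] + sum(lam[i] * jac_g[i][j] for i in range(m))
                for j in range(n)]
    # Componentwise min(lam_i, -g_i(z)), the complementarity part of (15).
    comp = [min(lam[i], -g_val[i]) for i in range(m)]
    # Euclidean norm of the stacked vector.
    return math.sqrt(sum(v * v for v in lag_grad) + sum(v * v for v in comp))
```

For the one-variable problem $\min \tfrac12 z^2$ subject to $z \le 0$, the residual vanishes at the solution $(z^*, \lambda^*) = (0, 0)$, where the single constraint is weakly active.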

We use order notation in the following (fairly standard) way: If two matrix,
vector, or scalar quantities $M$ and $A$ are functions of a common quantity, we
write $M = O(\|A\|)$ if there is a constant $\beta$ such that $\|M\| \le \beta\|A\|$ whenever
$\|A\|$ is sufficiently small. We write $M = \Omega(\|A\|)$ if there is a constant $\beta$ such
that $\|M\| \ge \beta^{-1}\|A\|$ whenever $\|A\|$ is sufficiently small, and $M = \Theta(\|A\|)$ if both
$M = O(\|A\|)$ and $M = \Omega(\|A\|)$. We write $M = o(\|A\|)$ if for all sequences $\{A_k\}$
with $\|A_k\| \to 0$, the corresponding sequence $\{M_k\}$ satisfies $\|M_k\|/\|A_k\| \to 0$. By
using this notation, we can rewrite the conclusion of Theorem 2 as follows:

\[ \eta(z,\lambda) = \Theta(\delta(z,\lambda)). \tag{16} \]



3. Detecting Active Constraints 

We now describe a procedure, named Procedure IDO, for identifying those
inequality constraints that are active at the solution, and classifying them
according to whether they are weakly active or strongly active. We prove that
Procedure IDO classifies the indices correctly given a point $(z, \lambda)$ sufficiently close to
the primal-dual optimal set $\mathcal{S}$. Finally, we describe some implementation issues
for this procedure.

3.1. The Detection Procedure

Facchinei, Fischer, and Kanzow showed that the function $\eta(z,\lambda)$ defined in
(15) can be used as the basis of a scheme for identifying the active set $\mathcal{B}$. Choosing
some $\tau \in (0,1)$, they estimated

\[ \mathcal{A}(z,\lambda) \stackrel{\mathrm{def}}{=} \{ i = 1,2,\dots,m \mid g_i(z) \ge -\eta(z,\lambda)^\tau \}. \tag{17} \]

We have the following result.

Theorem 3. Suppose that Assumption 1 holds. Then there exists $\bar\delta > 0$ such
that for all $(z,\lambda)$ with $\delta(z,\lambda) \le \bar\delta$, we have $\mathcal{A}(z,\lambda) = \mathcal{B}$.

Proof. The result follows immediately from Facchinei, Fischer, and Kanzow
[Definition 2.1, Theorem 2.3] and Theorem 2 above.

A scheme for estimating $\mathcal{B}_+$ (hence, $\mathcal{B}_0$) is described by Facchinei, Fischer, and
Kanzow, but it requires the strict MFCQ condition to hold, which implies that
$\mathcal{S}_\lambda$ is a singleton. Here we describe a more complicated scheme for estimating
$\mathcal{B}_+$ that requires only the conditions of Theorem 3 to hold.
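
The estimate (17) amounts to a single thresholding pass over the constraint values. A minimal sketch follows; the function name `estimate_active_set` and the 0-based indexing are assumptions made for illustration.

```python
def estimate_active_set(g_val, eta, tau):
    """Return the index set A(z, lam) of (17), using 0-based indices.

    g_val : constraint values g(z)
    eta   : the residual eta(z, lam) from (15)
    tau   : exponent in (0, 1)
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie in (0, 1)")
    # Since tau < 1, the threshold eta**tau shrinks more slowly than eta itself,
    # so constraints that are active at z* stay inside the band near the solution.
    threshold = -(eta ** tau)
    return [i for i, gi in enumerate(g_val) if gi >= threshold]
```

With `g_val = [-1e-4, -0.5, 0.0]`, `eta = 1e-4`, and `tau = 0.5`, the threshold is about $-0.01$, so the first and third constraints are flagged as active and the second is not.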

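Our scheme is based on linear programming subproblems of the following
form, for a given parameter $\tau \in (0,1)$ and a given set $\hat{\mathcal{A}} \subset \mathcal{A}(z,\lambda)$:

\[ \max_{\hat\lambda} \ \sum_{i \in \hat{\mathcal{A}}} \hat\lambda_i \quad \text{subject to} \tag{18a} \]
\[ -\eta(z,\lambda)^\tau e \le \nabla\phi(z) + \sum_{i \in \mathcal{A}(z,\lambda)} \hat\lambda_i \nabla g_i(z) \le \eta(z,\lambda)^\tau e, \tag{18b} \]
\[ \hat\lambda_i \ge 0 \ \text{for all } i \in \mathcal{A}(z,\lambda); \quad \hat\lambda_i = 0 \ \text{otherwise,} \tag{18c} \]

where $e$ denotes the vector of all ones. Note that the objective function involves
elements $\hat\lambda_i$ only for indices $i$ in the subset $\hat{\mathcal{A}}$, whereas the $\hat\lambda_i$ are permitted to be
nonzero for all $i \in \mathcal{A}(z,\lambda)$. The idea is that $\hat{\mathcal{A}}$ contains those indices that may
belong to $\mathcal{B}_0$; by the time we solve (18), we have already decided that the other
indices $i \in \mathcal{A}(z,\lambda) \setminus \hat{\mathcal{A}}$ probably belong to $\mathcal{B}_+$.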

The complete procedure is as follows. 

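Procedure IDO

  Given constants $\tau$ and $\hat\tau$ satisfying $0 < \hat\tau < \tau < 1$, and point $(z,\lambda)$;
  Evaluate $\eta(z,\lambda)$ from (15) and $\mathcal{A}(z,\lambda)$ from (17);
  Define $\mathcal{A}_{\rm init} = \mathcal{A}(z,\lambda) \setminus \{ i \mid \lambda_i \ge \eta(z,\lambda)^{\hat\tau} \}$;
  $\hat{\mathcal{A}} \leftarrow \mathcal{A}_{\rm init}$;
  repeat
    solve (18) to find $\hat\lambda$;
    set $\mathcal{C} = \{ i \in \hat{\mathcal{A}} \mid \hat\lambda_i \ge \eta(z,\lambda)^{\hat\tau} \}$;
    if $\mathcal{C} = \emptyset$
      stop with $\mathcal{A}_0 = \hat{\mathcal{A}}$, $\mathcal{A}_+ = \mathcal{A}(z,\lambda) \setminus \hat{\mathcal{A}}$;
    else
      set $\hat{\mathcal{A}} \leftarrow \hat{\mathcal{A}} \setminus \mathcal{C}$;
      if $\hat{\mathcal{A}} = \emptyset$
        stop with $\mathcal{A}_0 = \emptyset$, $\mathcal{A}_+ = \mathcal{A}(z,\lambda)$;
      end(if)
    end(if)
  end(repeat)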



This procedure terminates finitely; in fact, the number of times the "repeat"
loop executes is bounded by the cardinality of $\mathcal{A}_{\rm init}$.

We prove that Procedure IDO successfully identifies $\mathcal{B}_+$ (for all $\delta(z,\lambda)$
sufficiently small) in several steps, culminating in Theorem 4. First, we estimate the
distance of $(z,\hat\lambda)$ to the solution set $\mathcal{S}$, where $\hat\lambda$ is the solution of (18) for some
$\hat{\mathcal{A}}$.

Lemma 1. Suppose that Assumption 1 holds. Then there are positive constants
$\delta_0$ and $\kappa_2$ such that whenever $\delta(z,\lambda) \le \delta_0$, any feasible point $\hat\lambda$ of (18) at any
iteration of Procedure IDO satisfies

\[ \delta(z,\hat\lambda) \le \kappa_2 \delta(z,\lambda)^\tau. \]

Proof. Initially choose $\delta_0 = \bar\delta$ for $\bar\delta$ defined in Theorem 3, so that $\mathcal{A}(z,\lambda) = \mathcal{B}$.
Hence, we have $\hat{\mathcal{A}} \subset \mathcal{B}$ at all iterations of Procedure IDO.
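
The control flow of Procedure IDO can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the name `procedure_ido` is hypothetical, and the linear program (18) is delegated to a user-supplied callable `solve_lp`, which is assumed to return a solution of (18) for the current working set as a mapping from index to value.

```python
def procedure_ido(lam, eta, active, tau_hat, solve_lp):
    """Sketch of Procedure IDO; returns (A_0, A_plus) as sorted index lists.

    lam      : current multiplier estimate lambda
    eta      : eta(z, lam) from (15)
    active   : the estimate A(z, lam) from (17)
    tau_hat  : threshold exponent, 0 < tau_hat < tau < 1
    solve_lp : hypothetical callable; solve_lp(working) returns a solution
               lam_hat of LP (18) for the working set, as a dict index -> value
    """
    cut = eta ** tau_hat
    # A_init: active indices whose current multiplier is below the cut.
    working = {i for i in active if lam[i] < cut}
    # An empty working set ends the procedure with A_0 empty, so the LP
    # is skipped in that case.
    while working:
        lam_hat = solve_lp(frozenset(working))
        # The set C: indices that the LP shows can carry a large multiplier.
        large = {i for i in working if lam_hat[i] >= cut}
        if not large:
            break
        working -= large
    return sorted(working), sorted(set(active) - working)
```

Each pass through the loop either stops or removes at least one index from the working set, which mirrors the finite-termination bound stated above.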

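We now estimate $\eta(z,\hat\lambda)$ using the definition (15). We have directly from the
constraints (18b) that

\[ \|\mathcal{L}_z(z,\hat\lambda)\|_\infty \le \eta(z,\lambda)^\tau. \]

For the vector $\min(\hat\lambda, -g(z))$, we have for $i \in \mathcal{B}$ that $g_i(z^*) = 0$ and $\hat\lambda_i \ge 0$, and
so

\[ i \in \mathcal{B} \ \Rightarrow \ |\min(\hat\lambda_i, -g_i(z))| \le |g_i(z)| = O(\|z - z^*\|) = O(\delta(z,\lambda)). \]

Meanwhile for $i \notin \mathcal{B} = \mathcal{A}(z,\lambda)$, we have $\hat\lambda_i = 0$ and $g_i(z^*) < 0$, and so

\[ i \notin \mathcal{B} \ \Rightarrow \ |\min(\hat\lambda_i, -g_i(z))| = \max(0, g_i(z)) \le |g_i(z) - g_i(z^*)| = O(\delta(z,\lambda)). \]

By substituting these estimates into (15), and using the equivalence of $\|\cdot\|_\infty$
and the Euclidean norm and the result of Theorem 2, we have that there is a
constant $\bar\kappa_2 > 0$ such that

\[ \eta(z,\hat\lambda) \le \bar\kappa_2 \delta(z,\lambda)^\tau. \]

Using Theorem 2 again, we have

\[ \delta(z,\hat\lambda) \le \kappa_0^{-1} \eta(z,\hat\lambda) \le \kappa_0^{-1} \bar\kappa_2 \delta(z,\lambda)^\tau, \tag{19} \]

giving the result.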

In the next two lemmas and Theorem 4, we show that for $\delta(z,\lambda)$ sufficiently
small, Procedure IDO terminates with $\mathcal{A}_0 = \mathcal{B}_0$ and $\mathcal{A}_+ = \mathcal{B}_+$.

Lemma 2. Suppose that Assumption 1 holds. Then there is $\delta_1 > 0$ such that
whenever $\delta(z,\lambda) \le \delta_1$, Procedure IDO terminates with $\mathcal{B}_0 \subset \mathcal{A}_0$.

Proof. Since we know the procedure terminates finitely, we need show only that
$\mathcal{B}_0 \subset \hat{\mathcal{A}}$ at all iterations of the procedure. Initially set $\delta_1 = \delta_0 \le \bar\delta$, so that
$\mathcal{A}(z,\lambda) = \mathcal{B}$ and the result of Lemma 1 holds. Suppose for contradiction there
is an index $j \in \mathcal{B}_0$ such that $j$ either is not included in the initial index set $\mathcal{A}_{\rm init}$
or else is deleted from $\hat{\mathcal{A}}$ at some iteration of Procedure IDO.

Suppose first that $j$ is not included in $\mathcal{A}_{\rm init}$. Then we must have $\lambda_j \ge \eta(z,\lambda)^{\hat\tau}$,
which by Theorem 2 implies that

\[ \delta(z,\lambda) \ge |\lambda_j| \ge \eta(z,\lambda)^{\hat\tau} \ge \kappa_0^{\hat\tau} \delta(z,\lambda)^{\hat\tau}. \tag{20} \]

However, by decreasing $\delta_1$ and using $\hat\tau \in (0,1)$, we can ensure that (20) does not
hold whenever $\delta(z,\lambda) \le \delta_1$. Hence, $j$ is included in $\mathcal{A}_{\rm init}$.

Suppose now that $j \in \mathcal{B}_0$ is deleted from $\hat{\mathcal{A}}$ at some subsequent iteration.
For this to happen, the subproblem (18) must have a solution $\hat\lambda$ with

\[ \hat\lambda_j \ge \eta(z,\lambda)^{\hat\tau} \tag{21} \]

for some $\hat{\mathcal{A}} \subset \mathcal{B}$. Hence from Theorem 2 we have that

\[ \delta(z,\hat\lambda) \ge \hat\lambda_j \ge \eta(z,\lambda)^{\hat\tau} \ge \kappa_0^{\hat\tau} \delta(z,\lambda)^{\hat\tau}. \tag{22} \]

By combining the result of Lemma 1 with (22), we have that

\[ \kappa_2 \delta(z,\lambda)^\tau \ge \kappa_0^{\hat\tau} \delta(z,\lambda)^{\hat\tau}. \]

However, this inequality cannot hold when $\delta(z,\lambda)$ is smaller than $(\kappa_0^{\hat\tau} \kappa_2^{-1})^{1/(\tau - \hat\tau)}$.
Therefore, by decreasing $\delta_1$ if necessary, we have a contradiction in this case also.



Lemma 3. Suppose that Assumption 1 holds. Then there is $\delta_2 > 0$ such that
whenever $\delta(z,\lambda) \le \delta_2$, Procedure IDO terminates with $\mathcal{B}_+ \subset \mathcal{A}_+$.

Proof. Given any $j \in \mathcal{B}_+$, we have for sufficiently small choice of $\delta_2$ that $j \in
\mathcal{A}(z,\lambda)$. We prove the result by showing that Procedure IDO cannot terminate
with $j \in \mathcal{A}_0$.

We initially set $\delta_2 = \delta_1$, where $\delta_1$ is the constant from Lemma 2. (We reduce it
as necessary, but maintain $\delta_2 > 0$, in the course of the proof.) For contradiction,
assume that there is $j \in \mathcal{B}_+$ such that $j \in \hat{\mathcal{A}}$ at all iterations of Procedure IDO,
including the iteration on which the procedure terminates and sets $\mathcal{A}_0 = \hat{\mathcal{A}}$.
Recalling the definition (9) of $\epsilon_\lambda$, we use compactness of $\mathcal{S}_\lambda$ to choose $\lambda^* \in \mathcal{S}_\lambda$
such that $\epsilon_\lambda = \min_{i \in \mathcal{B}_+} \lambda_i^*$. In particular, we have

\[ \lambda_j^* \ge \epsilon_\lambda > 0 \]

for our chosen index $j$. We claim that, by reducing $\delta_2$ if necessary, we can ensure
that $\lambda^*$ is feasible for (18) whenever $\delta(z,\lambda) \le \delta_2$. Obviously, since $\mathcal{A}(z,\lambda) = \mathcal{B}$
by Theorem 3, $\lambda^*$ is feasible with respect to (18c). Since $\lambda^* \in \mathcal{S}_\lambda$ and

\[ \|z - z^*\| \le \delta(z,\lambda) \le \kappa_0^{-1} \eta(z,\lambda), \]

we have

\[ \Big\| \nabla\phi(z) + \sum_{i \in \mathcal{B}_+} \lambda_i^* \nabla g_i(z) \Big\|_\infty \le M \|z - z^*\| \le M \kappa_0^{-1} \eta(z,\lambda), \tag{23} \]

for some constant $M$ that depends on the norms of $\nabla^2\phi(\cdot)$ and $\nabla^2 g_i(\cdot)$, $i \in \mathcal{B}_+$,
in the neighborhood of $z^*$ and on a bound on the set $\mathcal{S}_\lambda$ (which is bounded,
because of MFCQ). Since $\tau < 1$ and since $\eta(z,\lambda) = O(\delta(z,\lambda))$, we can reduce
$\delta_2$ if necessary to ensure that

\[ M \kappa_0^{-1} \eta(z,\lambda) \le \eta(z,\lambda)^\tau \]

whenever $\delta(z,\lambda) \le \delta_2$, thereby ensuring that the constraints (18b) are satisfied
by $\lambda^*$.

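Since $\lambda^*$ is feasible for (18), a lower bound on the optimal objective is

\[ \sum_{i \in \hat{\mathcal{A}}} \lambda_i^* \ge \lambda_j^* \ge \epsilon_\lambda. \]

However, since Procedure IDO terminates with $j \in \hat{\mathcal{A}}$, we must have that $\mathcal{C} = \emptyset$
for the solution $\hat\lambda$ of (18) with this particular choice of $\hat{\mathcal{A}}$. But we can have $\mathcal{C} = \emptyset$
only if $\hat\lambda_i < \eta(z,\lambda)^{\hat\tau}$ for all $i \in \hat{\mathcal{A}}$, which means that the optimal objective is
no greater than $m\,\eta(z,\lambda)^{\hat\tau}$. But since $\eta(z,\lambda) = O(\delta(z,\lambda))$, we can reduce $\delta_2$ if
necessary to ensure that

\[ m\,\eta(z,\lambda)^{\hat\tau} < \epsilon_\lambda \]

whenever $\delta(z,\lambda) \le \delta_2$. This gives a contradiction, so that $\mathcal{A}_0$ (which is set by
Procedure IDO to the final $\hat{\mathcal{A}}$) can contain no indices $j \in \mathcal{B}_+$. Since $\mathcal{B}_+ \subset \mathcal{B} =
\mathcal{A}(z,\lambda)$ whenever $\delta(z,\lambda) \le \delta_2$, we must therefore have $\mathcal{B}_+ \subset \mathcal{A}_+$, as claimed.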

By using the quantity $\delta_2$ from Lemma 3, we combine this result with
Theorem 3 and Lemma 2 to obtain the following theorem.

Theorem 4. Suppose that Assumption 1 holds. Then there is $\delta_2 > 0$ such that
whenever $\delta(z,\lambda) \le \delta_2$, Procedure IDO terminates with $\mathcal{A}_+ = \mathcal{B}_+$ and $\mathcal{A}_0 = \mathcal{B}_0$.



3.2. Scheme for Finding an Interior Multiplier Estimate 

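We now describe a scheme for finding a vector $\hat\lambda$ that is close to $\mathcal{S}_\lambda$ but not too
close to the relative boundary of this set. In other words, the quantity $\min_{i \in \mathcal{B}_+} \hat\lambda_i$
is not too far from its maximum achievable value $\epsilon_\lambda$.

We find $\hat\lambda$ by solving a linear programming problem similar to (18) but
containing an extra variable to represent $\min_{i \in \mathcal{B}_+} \hat\lambda_i$. We state this problem as
follows:

\[ \max_{\hat t, \hat\lambda} \ \hat t \quad \text{subject to} \tag{24a} \]
\[ \hat t \le \hat\lambda_i, \quad \text{for all } i \in \mathcal{A}_+, \tag{24b} \]
\[ -\eta(z,\lambda)^\tau e \le \nabla\phi(z) + \sum_{i \in \mathcal{A}_+} \hat\lambda_i \nabla g_i(z) \le \eta(z,\lambda)^\tau e, \tag{24c} \]
\[ \hat\lambda_i \ge 0, \ \text{for all } i \in \mathcal{A}_+; \quad \hat\lambda_i = 0 \ \text{otherwise.} \tag{24d} \]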
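
To pass (24) to a general-purpose linear programming code, it helps to write it in inequality form with the variables ordered as $(\hat t, \hat\lambda_i,\ i \in \mathcal{A}_+)$. The sketch below only assembles the data; the function name `build_lp24` and the data layout are assumptions made for illustration. The returned tuple follows the common "minimize $c^T x$ subject to $A_{ub} x \le b_{ub}$ with simple bounds" convention (the one used, for example, by `scipy.optimize.linprog`), but no solver is called here.

```python
def build_lp24(grad_phi, jac_g, a_plus, eta, tau):
    """Assemble LP (24) with variables x = (t, lam_i for i in a_plus).

    grad_phi : gradient of phi at z (length n)
    jac_g    : jac_g[i] is the gradient of g_i at z
    a_plus   : list of indices in the estimated strongly active set A_+
    eta, tau : residual eta(z, lam) and exponent tau from (24c)
    Returns (cost, a_ub, b_ub, bounds) for a minimization in inequality form.
    """
    n, p = len(grad_phi), len(a_plus)
    bound = eta ** tau
    cost = [-1.0] + [0.0] * p            # minimize -t, i.e. maximize t (24a)
    a_ub, b_ub = [], []
    for k in range(p):                   # (24b): t - lam_k <= 0
        row = [1.0] + [0.0] * p
        row[1 + k] = -1.0
        a_ub.append(row)
        b_ub.append(0.0)
    for j in range(n):                   # (24c), one component at a time
        col = [jac_g[i][j] for i in a_plus]
        a_ub.append([0.0] + col)                 # upper side of (24c)
        b_ub.append(bound - grad_phi[j])
        a_ub.append([0.0] + [-v for v in col])   # lower side of (24c)
        b_ub.append(bound + grad_phi[j])
    # (24d): t is free; multipliers in A_+ are nonnegative. Multipliers outside
    # A_+ are fixed at zero and therefore omitted from the variable list.
    bounds = [(None, None)] + [(0.0, None)] * p
    return cost, a_ub, b_ub, bounds
```

Omitting the multipliers outside $\mathcal{A}_+$ keeps the problem small; the same layout, with a different cost vector and without the variable $\hat t$, would serve for the subproblems (18).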



Theorem 5. Suppose that Assumption 1 holds. Then there is a positive number
$\delta_3$ such that (24) is feasible and bounded whenever $\delta(z,\lambda) \le \delta_3$, and its optimal
objective is at least $\epsilon_\lambda$ (for $\epsilon_\lambda$ defined in (9)). Moreover, there is a constant
$\beta' > 0$ such that $\delta(z,\hat\lambda) \le \beta' \delta(z,\lambda)^\tau$.

Proof. Let $\lambda^* \in \mathcal{S}_\lambda$ be chosen so that $\epsilon_\lambda = \min_{i \in \mathcal{B}_+} \lambda_i^*$. We show first that
$(\hat t, \hat\lambda) = (\epsilon_\lambda, \lambda^*)$ is feasible for (24), thereby proving that this linear program is
feasible and that the optimum objective value is at least $\epsilon_\lambda$.

Initially we set $\delta_3 = \delta_2$. By definition (9), the constraint (24b) is satisfied
by $(\hat t, \hat\lambda) = (\epsilon_\lambda, \lambda^*)$. Since $\delta(z,\lambda) \le \delta_3 = \delta_2$, we have from Theorem 4 that
$\mathcal{A}_+ = \mathcal{B}_+$, so that (24d) also holds. Satisfaction of (24c) follows from (23), by
choice of $\delta_2$. Moreover, it is clear from $\mathcal{A}_+ = \mathcal{B}_+$ that the optimal $(\hat t, \hat\lambda)$ will
satisfy $\hat t = \min_{i \in \mathcal{B}_+} \hat\lambda_i$.

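We now show that the problem (24) is bounded for $\delta(z,\lambda)$ sufficiently small.
Let $y$ be the vector in (10), and decrease $\delta_3$ if necessary so that we can choose
a number $\zeta > 0$ such that

\[ \delta(z,\lambda) \le \delta_3 \ \Rightarrow \ y^T \nabla g_i(z) \le -\zeta, \quad \text{for all } i \in \mathcal{A}_+ = \mathcal{B}_+. \tag{25} \]

From the constraints (24c) and the triangle inequality, we have that

\[ \begin{aligned} \Big| \sum_{i \in \mathcal{A}_+} \hat\lambda_i y^T \nabla g_i(z) \Big| &\le |y^T \nabla\phi(z)| + \Big| y^T \nabla\phi(z) + \sum_{i \in \mathcal{A}_+} \hat\lambda_i y^T \nabla g_i(z) \Big| \\ &\le \|y\|_1 \|\nabla\phi(z)\|_\infty + \|y\|_1 \Big\| \nabla\phi(z) + \sum_{i \in \mathcal{A}_+} \hat\lambda_i \nabla g_i(z) \Big\|_\infty \\ &\le \|y\|_1 \|\nabla\phi(z)\|_\infty + \|y\|_1 \eta(z,\lambda)^\tau. \end{aligned} \]

However, from (25) and $\hat\lambda_i \ge 0$, $i \in \mathcal{A}_+$, we have that

\[ \Big| \sum_{i \in \mathcal{A}_+} \hat\lambda_i y^T \nabla g_i(z) \Big| \ge \zeta \|\hat\lambda_{\mathcal{A}_+}\|_1. \]

By combining these bounds, we obtain that

\[ \|\hat\lambda_{\mathcal{A}_+}\|_1 \le \zeta^{-1} \|y\|_1 \big[ \|\nabla\phi(z)\|_\infty + \eta(z,\lambda)^\tau \big], \]

whenever $\delta(z,\lambda) \le \delta_3$, so that the feasible region for (24) is bounded, as claimed.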

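To prove our final claim that $\delta(z,\hat\lambda) \le \beta' \delta(z,\lambda)^\tau$ for some $\beta' > 0$, we use
Theorem 2. We have from (24c) and the cited theorem that

\[ \|\mathcal{L}_z(z,\hat\lambda)\|_\infty \le \eta(z,\lambda)^\tau \le \kappa_1^\tau \delta(z,\lambda)^\tau. \]

For $i \in \mathcal{A}_+ = \mathcal{B}_+$, we have from $\hat\lambda_i \ge \epsilon_\lambda$ and $g_i(z^*) = 0$ that

\[ i \in \mathcal{A}_+ \ \Rightarrow \ |\min(\hat\lambda_i, -g_i(z))| \le |g_i(z)| \le |g_i(z) - g_i(z^*)| = O(\|z - z^*\|) = O(\delta(z,\lambda)). \]

For $i \notin \mathcal{A}_+$, we have $\hat\lambda_i = 0$ and $g_i(z^*) \le 0$, and so

\[ i \notin \mathcal{A}_+ \ \Rightarrow \ |\min(\hat\lambda_i, -g_i(z))| = \max(0, g_i(z)) \le |g_i(z) - g_i(z^*)| = O(\|z - z^*\|) = O(\delta(z,\lambda)). \]

By substituting the last three bounds into (15) and applying Theorem 2, we
obtain the result.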



3.3. Computational Aspects 

Solution of the linear programs (18) is in general less expensive than solution
of the quadratic programs or complementarity problems that must be solved
at each step of an optimization algorithm with rapid local convergence. Linear
programming software is easy to use and readily available. Moreover, given a
point $(z,\lambda)$ with $\delta(z,\lambda)$ small, we can expect $\mathcal{A}_{\rm init}$ not to contain many more
indices than the weakly active set $\mathcal{B}_0$, so that few iterations of the "repeat" loop
in Procedure IDO should be needed.

Finally, we note that when more than one iteration of the "repeat" loop is
needed in Procedure IDO, the linear programs to be solved at successive iterations
differ only in the cost vector in (18a). Therefore, if the dual formulation of (18)
is used, the solution of one linear program can typically be obtained at minimal
cost from the solution of the previous linear program in the sequence. To clarify
this claim, we simplify notation and write (18) as follows:

\[ \max_\pi \ c^T \pi \quad \text{subject to } b_1 \le A\pi \le b_2, \ \pi \ge 0, \tag{26} \]

where $\pi = [\hat\lambda_i]_{i \in \mathcal{A}(z,\lambda)}$, while $c$, $b_1$, $b_2$, and $A$ are defined in obvious ways. In
particular, $c$ is a vector with elements 0 and 1, with the 1's in positions
corresponding to the index set $\hat{\mathcal{A}}$. The dual of (26) is

\[ \min_{y_1, y_2, s} \ b_2^T y_2 - b_1^T y_1 \quad \text{subject to} \quad \begin{bmatrix} A^T & -A^T & I \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ s \end{bmatrix} = -c, \quad (y_1, y_2, s) \ge 0. \]

When the set $\hat{\mathcal{A}}$ is changed, some of the 1's in the vector $c$ are replaced by zeros.
When only a few such changes are made, and the previous optimal basis is used
to hot-start the method, we expect that only a few iterations of the dual simplex
method will be needed to recover the solution of the new linear program.

4. SQP and Stabilized SQP 

In the best-known form of the SQP algorithm (with exact second-order
information), the following inequality constrained subproblem is solved to obtain the
step $\Delta z$ at each iteration:

\[ \min_{\Delta z} \ \Delta z^T \nabla\phi(z) + \tfrac12 \Delta z^T \mathcal{L}_{zz}(z,\lambda) \Delta z, \quad \text{subject to } g(z) + \nabla g(z)^T \Delta z \le 0, \tag{27} \]

where $(z,\lambda)$ is the current primal-dual iterate. Denoting the Lagrange multipliers
for the constraints in (27) by $\lambda^+$, we see that the solution $\Delta z$ satisfies the
following KKT conditions (cf. (6)):

\[ \begin{bmatrix} \mathcal{L}_{zz}(z,\lambda)\Delta z + \nabla\phi(z) + \nabla g(z)\lambda^+ \\ g(z) + \nabla g(z)^T \Delta z \end{bmatrix} \in \begin{bmatrix} 0 \\ N(\lambda^+) \end{bmatrix}, \tag{28} \]

where $N(\cdot)$ is defined as in (7).

In the stabilized SQP method, we choose a parameter $\mu \ge 0$ and seek a
solution of the following minimax subproblem for $(\Delta z, \lambda^+)$ such that $(\Delta z, \lambda^+ - \lambda)$
is small:

\[ \min_{\Delta z} \ \max_{\lambda^+ \ge 0} \ \Delta z^T \nabla\phi(z) + \tfrac12 \Delta z^T \mathcal{L}_{zz}(z,\lambda) \Delta z + (\lambda^+)^T [g(z) + \nabla g(z)^T \Delta z] - \tfrac12 \mu \|\lambda^+ - \lambda\|^2. \tag{29} \]

The parameter $\mu$ can depend on an estimate of the distance $\delta(z,\lambda)$ to the
primal-dual solution set; for example, $\mu = \eta(z,\lambda)^\sigma$ for some $\sigma \in (0,1)$. We can also write
(29) as a linear complementarity problem, corresponding to (28), as follows:

\[ \begin{bmatrix} \mathcal{L}_{zz}(z,\lambda)\Delta z + \nabla\phi(z) + \nabla g(z)\lambda^+ \\ g(z) + \nabla g(z)^T \Delta z - \mu(\lambda^+ - \lambda) \end{bmatrix} \in \begin{bmatrix} 0 \\ N(\lambda^+) \end{bmatrix}. \tag{30} \]

Li and Qi [10] derive a quadratic program in $(\Delta z, \lambda^+)$ that is equivalent to (29)
and (30):

\[ \min_{(\Delta z, \lambda^+)} \ \Delta z^T \nabla\phi(z) + \tfrac12 \Delta z^T \mathcal{L}_{zz}(z,\lambda) \Delta z + \tfrac12 \mu \|\lambda^+\|^2, \quad \text{subject to } g(z) + \nabla g(z)^T \Delta z - \mu(\lambda^+ - \lambda) \le 0. \tag{31} \]

Under conditions stronger than those assumed in this paper, the results of
Wright [18] and Hager can be used to show that the iterates generated by
(29) (or (30) or (31)) yield superlinear convergence of the sequence $(z^k, \lambda^k)$ of
Q-order $1 + \sigma$. Our aim in the next section is to add a strategy for adjusting the
multiplier, with a view to obtaining superlinear convergence under a weaker set
of conditions.

5. Multiplier Adjustment and Superlinear Convergence

We show in this section that through use of Procedure IDO and the multiplier
adjustment strategy (24), we can devise a stabilized SQP algorithm that
converges superlinearly whenever the initial iterate $(z^0, \lambda^0)$ is sufficiently close to
the primal-dual solution set $\mathcal{S}$. Only Assumption 1 is needed for this result.

Key to our analysis is Theorem 1 of Hager. We state this result in the
appendix, using our current notation and making a slight correction to the
original statement. Here we state an immediate corollary of Hager's result that
applies under our standing assumption.

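Corollary 1. Suppose that Assumption 1 holds, and let $\lambda^* \in \mathcal{S}_\lambda$ be such that
$\lambda_i^* > 0$ for all $i \in \mathcal{B}_+$. Then for any sufficiently large positive $\sigma_0$, there are
positive constants $\rho_0$, $\sigma_1$, $\gamma \ge 1$, and $\beta$ such that $\sigma_0 \rho_0 < \sigma_1$, with the following
property: For any $(z^0, \lambda^0)$ with

\[ \|(z^0,\lambda^0) - (z^*,\lambda^*)\| \le \rho_0, \tag{32} \]

we can generate an iteration sequence $\{(z^k, \lambda^k)\}$, $k = 0, 1, 2, \dots$, by setting

\[ (z^{k+1}, \lambda^{k+1}) = (z^k + \Delta z, \lambda^+), \]

where, at iteration $k$, $(\Delta z, \lambda^+)$ is the local solution of the sSQP subproblem with

\[ (z,\lambda) = (z^k,\lambda^k), \quad \mu = \mu_k \in [\sigma_0 \|z^k - z^*\|, \sigma_1], \tag{33} \]

that satisfies

\[ \|(z^k + \Delta z, \lambda^+) - (z^*,\lambda^*)\| \le \gamma \|(z^0,\lambda^0) - (z^*,\lambda^*)\|. \tag{34} \]

Moreover, we have

\[ \delta(z^{k+1},\lambda^{k+1}) \le \beta \big[ \delta(z^k,\lambda^k)^2 + \mu_k \delta(\lambda^k) \big]. \tag{35} \]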

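Recalling our definition (9) of $\epsilon_\lambda$, we define the following parametrized subset
of $\mathcal{S}_\lambda$:

\[ \mathcal{S}_\lambda^\nu \stackrel{\mathrm{def}}{=} \{ \lambda \in \mathcal{S}_\lambda \mid \min_{i \in \mathcal{B}_+} \lambda_i \ge \nu \epsilon_\lambda \}. \tag{36} \]

It follows easily from the MFCQ assumption and (9) that $\mathcal{S}_\lambda^\nu$ is nonempty, closed,
bounded, and therefore compact for any $\nu \in [0,1]$.

We now show that the particular choice of stabilization parameter $\mu =
\eta(z,\lambda)^\sigma$, for some $\sigma \in (0,1)$, eventually satisfies (33).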

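Lemma 4. Suppose the assumptions of Corollary 1 are satisfied, and let $\lambda^*$
be as defined there. Let $\sigma$ be any constant in $(0,1)$. Then there is a quantity
$\rho_2 \in (0, \rho_0]$ such that when $(z^0, \lambda^0)$ satisfies

\[ \|(z^0,\lambda^0) - (z^*,\lambda^*)\| \le \rho_2, \tag{37} \]

the results of Corollary 1 hold when we set the stabilization parameter at iteration
$k$ to the following particular value:

\[ \mu = \mu_k = \eta(z^k,\lambda^k)^\sigma. \tag{38} \]

Proof. We prove the result by showing that $\mu_k$ defined by (38) satisfies (33)
for some choice of $\rho_2$. For contradiction, suppose that no such choice of $\rho_2$ is
possible, so that for each $\ell = 1, 2, 3, \dots$, there is a starting point $(z^0_{[\ell]}, \lambda^0_{[\ell]})$ with

\[ \|(z^0_{[\ell]}, \lambda^0_{[\ell]}) - (z^*,\lambda^*)\| \le \ell^{-1} \rho_0 \tag{39} \]

such that the sequence $\{(z^k_{[\ell]}, \lambda^k_{[\ell]})\}_{k=0,1,2,\dots}$ generated from this starting point
in the manner prescribed by Corollary 1 with $\mu_k = \eta(z^k_{[\ell]}, \lambda^k_{[\ell]})^\sigma$ eventually comes
across an index $k_\ell$ such that this choice of $\mu_k$ violates (33), that is, one of the
following two conditions holds:

\[ \sigma_0 \|z^{k_\ell}_{[\ell]} - z^*\| > \eta(z^{k_\ell}_{[\ell]}, \lambda^{k_\ell}_{[\ell]})^\sigma, \tag{40a} \]
\[ \sigma_1 < \eta(z^{k_\ell}_{[\ell]}, \lambda^{k_\ell}_{[\ell]})^\sigma. \tag{40b} \]

Assume that $k_\ell$ is the first such index for which the violation (40) occurs. By
(34) and (39), we have that

\[ \|(z^{k_\ell}_{[\ell]}, \lambda^{k_\ell}_{[\ell]}) - (z^*,\lambda^*)\| \le \gamma \|(z^0_{[\ell]}, \lambda^0_{[\ell]}) - (z^*,\lambda^*)\| \le \gamma \ell^{-1} \rho_0. \tag{41} \]

Therefore by Theorem 2 and (13), we have for $\ell$ sufficiently large that

\[ \eta(z^{k_\ell}_{[\ell]}, \lambda^{k_\ell}_{[\ell]})^\sigma \ge \kappa_0^\sigma \delta(z^{k_\ell}_{[\ell]}, \lambda^{k_\ell}_{[\ell]})^\sigma \ge \kappa_0^\sigma \|z^{k_\ell}_{[\ell]} - z^*\|^\sigma. \tag{42} \]

Hence, taking limits as $\ell \uparrow \infty$, we have that

\[ \frac{\eta(z^{k_\ell}_{[\ell]}, \lambda^{k_\ell}_{[\ell]})^\sigma}{\|z^{k_\ell}_{[\ell]} - z^*\|} \ge \kappa_0^\sigma \|z^{k_\ell}_{[\ell]} - z^*\|^{\sigma - 1} \uparrow \infty \quad \text{as } \ell \uparrow \infty. \]

Dividing both sides of (40a) by $\|z^{k_\ell}_{[\ell]} - z^*\|$, we conclude from finiteness of $\sigma_0$
that (40a) is impossible.

By using Theorem 2 again together with (41), we obtain

\[ \eta(z^{k_\ell}_{[\ell]}, \lambda^{k_\ell}_{[\ell]}) \le \kappa_1 \delta(z^{k_\ell}_{[\ell]}, \lambda^{k_\ell}_{[\ell]}) \le \kappa_1 \|(z^{k_\ell}_{[\ell]}, \lambda^{k_\ell}_{[\ell]}) - (z^*,\lambda^*)\| \le \kappa_1 \gamma \ell^{-1} \rho_0, \]

and therefore $\eta(z^{k_\ell}_{[\ell]}, \lambda^{k_\ell}_{[\ell]}) \to 0$ as $\ell \uparrow \infty$. Hence, (40b) cannot occur either, and
the proof is complete.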






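We now use a compactness argument to extend Corollary 1 from the single
multiplier $\lambda^*$ in the relative interior of $\mathcal{S}_\lambda$ to the entire set $\mathcal{S}_\lambda^\nu$, for any $\nu \in (0,1]$.

Theorem 6. Suppose that Assumption 1 holds, and fix $\nu \in (0,1]$. Then there
are positive constants $\hat\delta$, $\bar\gamma \ge 1$, and $\beta$ such that the following property holds:
Given $(z^0, \lambda^0)$ with

\[ \mathrm{dist}\big((z^0,\lambda^0), \{z^*\} \times \mathcal{S}_\lambda^\nu\big) \le \hat\delta, \]

the iteration sequence $\{(z^k, \lambda^k)\}_{k=0,1,2,\dots}$ generated in the manner described in
Corollary 1 with $\mu_k$, $k = 0, 1, 2, \dots$ chosen according to (38), satisfies the
following relations:

\[ \delta(z^{k+1},\lambda^{k+1}) \le \beta \delta(z^k,\lambda^k)^{1+\sigma}, \tag{43a} \]
\[ \lambda_i^k \ge \tfrac12 \nu \epsilon_\lambda, \quad \text{for all } i \in \mathcal{B}_+ \text{ and all } k = 0, 1, 2, \dots. \tag{43b} \]

Proof. For each $\lambda^* \in \mathcal{S}_\lambda^\nu$, we use Corollary 1 to obtain positive constants $\sigma_0(\lambda^*)$
(sufficiently large), $\sigma_1(\lambda^*)$, $\gamma(\lambda^*)$, and $\beta(\lambda^*)$, using the argument $\lambda^*$ for each
constant to emphasize the dependence on the choice of multiplier $\lambda^*$. In the
same vein, let $\rho_2(\lambda^*) \in (0, \rho_0(\lambda^*)]$ be the constant from Lemma 4. Now choose
$\hat\delta(\lambda^*) > 0$ for each $\lambda^* \in \mathcal{S}_\lambda^\nu$ in such a way that

\[ 0 < \hat\delta(\lambda^*) \le \tfrac12 \rho_2(\lambda^*), \tag{44a} \]
\[ \gamma(\lambda^*) \hat\delta(\lambda^*) \le \tfrac14 \nu \epsilon_\lambda, \tag{44b} \]

and consider the following open cover of $\mathcal{S}_\lambda^\nu$:

\[ \bigcup_{\lambda^* \in \mathcal{S}_\lambda^\nu} \{ \lambda \mid \|\lambda - \lambda^*\| < \hat\delta(\lambda^*) \}. \tag{45} \]

By compactness of $\mathcal{S}_\lambda^\nu$, we can find a finite subcover defined by points
$\lambda^1, \lambda^2, \dots, \lambda^f \in \mathcal{S}_\lambda^\nu$ as follows:

\[ \mathcal{S}_\lambda^\nu \subset \mathcal{V} \stackrel{\mathrm{def}}{=} \bigcup_{j=1,2,\dots,f} \{ \lambda \mid \|\lambda - \lambda^j\| < \hat\delta(\lambda^j) \}. \tag{46} \]

$\mathcal{V}$ is an open neighborhood of $\mathcal{S}_\lambda^\nu$. Now define

\[ \bar\gamma \stackrel{\mathrm{def}}{=} \max_{j=1,2,\dots,f} \gamma(\lambda^j), \quad \bar\beta \stackrel{\mathrm{def}}{=} \max_{j=1,2,\dots,f} \beta(\lambda^j), \quad \bar\delta \stackrel{\mathrm{def}}{=} \max_{j=1,2,\dots,f} \hat\delta(\lambda^j). \tag{47} \]

Also, choose a quantity $\hat\delta > 0$ with the following properties:

\[ \hat\delta \le \min_{j=1,2,\dots,f} \hat\delta(\lambda^j) \le \bar\delta, \tag{48a} \]
\[ \{ \lambda \mid \mathrm{dist}(\lambda, \mathcal{S}_\lambda^\nu) \le \hat\delta \} \subset \mathcal{V}, \tag{48b} \]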
I < ^, (48c) 
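\[ \hat\delta \le 1. \tag{48d} \]

Now consider $(z^0, \lambda^0)$ with

\[ \|(z^0,\lambda^0) - (z^*,\lambda^*)\| \le \hat\delta, \quad \text{for some } \lambda^* \in \mathcal{S}_\lambda^\nu. \tag{49} \]

We have $\mathrm{dist}(\lambda^0, \mathcal{S}_\lambda^\nu) \le \hat\delta$, and so $\lambda^0 \in \mathcal{V}$. It follows that for some $j = 1, 2, \dots, f$,
we have

\[ \|\lambda^0 - \lambda^j\| < \hat\delta(\lambda^j). \tag{50} \]

Moreover, since $\|z^0 - z^*\| \le \hat\delta$, we have from (48a) that

\[ \|(z^0,\lambda^0) - (z^*,\lambda^j)\| \le \hat\delta + \hat\delta(\lambda^j) \le 2\hat\delta(\lambda^j) \le \rho_2(\lambda^j), \tag{51} \]

where the final inequality follows from (44a). Application of Corollary 1 and
Lemma 4 now ensures that the stabilized SQP sequence starting at $(z^0, \lambda^0)$ with
$\mu = \mu_k$ chosen according to (38) yields a sequence $\{(z^k, \lambda^k)\}_{k=0,1,2,\dots}$ satisfying

\[ \|(z^k,\lambda^k) - (z^*,\lambda^j)\| \le \gamma(\lambda^j) \|(z^0,\lambda^0) - (z^*,\lambda^j)\| \le 2\gamma(\lambda^j)\hat\delta(\lambda^j) \le 2\bar\gamma\bar\delta, \tag{52} \]

where we use (47) to obtain the final inequality.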

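To prove (43a), we have from Lemma 4, Corollary 1, the bound (14),
Theorem 2, the definition (47), and the stabilizing parameter choice (38) that

\[ \begin{aligned} \delta(z^{k+1},\lambda^{k+1}) &\le \beta(\lambda^j) \big[ \delta(z^k,\lambda^k)^2 + \mu_k \delta(\lambda^k) \big] \\ &\le \bar\beta \big[ \delta(z^k,\lambda^k)^2 + \eta(z^k,\lambda^k)^\sigma \delta(z^k,\lambda^k) \big] && \text{from (47), (38), and (14)} \\ &\le \bar\beta \big[ \delta(z^k,\lambda^k)^2 + \kappa_1^\sigma \delta(z^k,\lambda^k)^{1+\sigma} \big] && \text{from Theorem 2} \\ &\le \bar\beta \big( (2\bar\gamma\bar\delta)^{1-\sigma} + \kappa_1^\sigma \big) \delta(z^k,\lambda^k)^{1+\sigma}, \end{aligned} \]

where in the last line we use $\delta(z^k,\lambda^k) \le \mathrm{dist}((z^k,\lambda^k), \{z^*\} \times \mathcal{S}_\lambda^\nu) \le 2\bar\gamma\bar\delta$. Therefore, the
result (43a) follows by setting $\beta = \bar\beta \big( (2\bar\gamma\bar\delta)^{1-\sigma} + \kappa_1^\sigma \big)$.

Finally, we have from (44b) (with $\lambda^* = \lambda^j$) and (52) that

\[ \mathrm{dist}\big((z^k,\lambda^k), \{z^*\} \times \mathcal{S}_\lambda^\nu\big) \le 2\gamma(\lambda^j)\hat\delta(\lambda^j) \le \tfrac12 \nu \epsilon_\lambda. \]

Therefore, we have

\[ i \in \mathcal{B}_+ \ \Rightarrow \ \lambda_i^k \ge \min_{\lambda \in \mathcal{S}_\lambda^\nu} \lambda_i - \tfrac12 \nu \epsilon_\lambda \ge \tfrac12 \nu \epsilon_\lambda, \]

verifying (43b) and completing the proof.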

We are now ready to state a stabilized SQP algorithm, in which multiplier
adjustment steps (consisting of Procedure IDO followed by solution of (24)) are
applied when the convergence does not appear to be rapid enough.
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Algorithm sSQPa 

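given $\sigma \in (0,1)$, $\tau$ and $\hat\tau$ with $0 < \hat\tau < \tau < 1$, tolerance tol;
given initial point $(z^0,\lambda^0)$ with $\lambda^0 \ge 0$;
$k \leftarrow 0$;
calculate $\mathcal A(z^0,\lambda^0)$ from (17);
call Procedure IDO to obtain $\mathcal A_+$, $\mathcal A_0$; solve (24) to obtain $\hat\lambda$;
$\lambda^0 \leftarrow \hat\lambda$;
repeat
    solve ( p9| ) with $(z,\lambda) = (z^k,\lambda^k)$ and $\mu = \mu_k = \eta(z^k,\lambda^k)^\sigma$
        to obtain $(\Delta z, \lambda^+)$;
    if $\eta(z^k + \Delta z, \lambda^+) \le \eta(z^k,\lambda^k)^{1+\sigma/2}$
        $(z^{k+1},\lambda^{k+1}) \leftarrow (z^k + \Delta z, \lambda^+)$;
        $k \leftarrow k + 1$;
    else
        calculate $\mathcal A(z^k,\lambda^k)$ from (17);
        call Procedure IDO to obtain $\mathcal A_+$, $\mathcal A_0$; solve (24) to obtain $\hat\lambda$;
        $\lambda^k \leftarrow \hat\lambda$;
    end (if)
until $\eta(z^k,\lambda^k) <$ tol.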
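
The control flow of Algorithm sSQPa can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the callables `eta`, `solve_ssqp`, and `adjust_multipliers` are hypothetical stand-ins for the residual function, the stabilized SQP subproblem solve, and the multiplier adjustment step (Procedure IDO followed by the adjustment subproblem). The iteration cap `max_iter` is a safeguard added here and is not part of the algorithm as stated. The stopping test is also checked at the top of the loop rather than at the bottom.

```python
def ssqpa(z, lam, eta, solve_ssqp, adjust_multipliers,
          sigma=0.5, tol=1e-10, max_iter=100):
    """Control-flow sketch of Algorithm sSQPa.

    All three callables are hypothetical stand-ins supplied by the caller:
      eta(z, lam)                -> nonnegative optimality residual
      solve_ssqp(z, lam, mu)     -> (dz, lam_plus) from the stabilized subproblem
      adjust_multipliers(z, lam) -> adjusted multiplier estimate
    Returns (z, lam, number of multiplier adjustments performed).
    """
    lam = adjust_multipliers(z, lam)   # initial adjustment, before the loop
    n_adjust = 1
    for _ in range(max_iter):          # iteration cap: safeguard added in this sketch
        r = eta(z, lam)
        if r < tol:                    # stopping test, checked before each solve
            return z, lam, n_adjust
        mu = r ** sigma                # stabilization parameter mu_k = eta^sigma
        dz, lam_plus = solve_ssqp(z, lam, mu)
        if eta(z + dz, lam_plus) <= r ** (1.0 + sigma / 2.0):
            z, lam = z + dz, lam_plus  # residual decreased fast enough: accept
        else:
            lam = adjust_multipliers(z, lam)  # too slow: adjust multipliers only
            n_adjust += 1
    raise RuntimeError("sSQPa sketch: iteration cap reached before convergence")
```

With a toy residual `eta = lambda z, lam: abs(z)` and a toy step that maps `z` to `z*z`, every step passes the acceptance test, so the only multiplier adjustment is the initial one. That is the behaviour the theorem below describes for starting points close enough to the solution set.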

The following result shows that when $(z^0,\lambda^0)$ is close enough to $S$, the initial
call to Procedure IDO is the only one needed.

Theorem 7. Suppose that Assumption 1 holds. Then there is a constant $\bar\delta > 0$
such that for any $(z^0,\lambda^0)$ with $\delta(z^0,\lambda^0) \le \bar\delta$, the "if" condition in Algorithm
sSQPa is always satisfied, and the sequence $\delta(z^k,\lambda^k)$ converges superlinearly to
zero with Q-order $1 + \sigma$.

Proof. Our result follows from Theorems | and |. Choose $\nu = 1/2$ in Theorem 6,
and let $\hat\delta$, $\gamma$, and $\beta$ be as defined there. Using also $\delta_3$ and $\beta'$ from Theorem g
and $\epsilon_\lambda$ defined in (^), we choose $\bar\delta$ as follows:

* =min ( M '(iO '(£) '(W^' Ko te) )• (53) 

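Now let $(z^0,\lambda^0)$ satisfy $\delta(z^0,\lambda^0) \le \bar\delta$, and let $\hat\lambda$ be calculated from (24). From
Theorem || and (53), we have that

$$\delta(z^0,\hat\lambda) \le \beta'\,\delta(z^0,\lambda^0)^\tau \le \beta'\bar\delta^\tau \le \tfrac12 \epsilon_\lambda \tag{54}$$

and

$$\hat\lambda_i \ge \epsilon_\lambda, \quad \text{for all } i \in \mathcal B_+, \tag{55a}$$

$$\hat\lambda_i = 0, \quad \text{for all } i \notin \mathcal B_+. \tag{55b}$$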






Stephen J. Wright 



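Since $S_\lambda$ is closed, there is a vector $\lambda^* \in S_\lambda$ such that

$$\delta(z^0,\hat\lambda) = \|(z^0,\hat\lambda) - (z^*,\lambda^*)\|. \tag{56}$$

From (54) and (55a), we have that

$$\lambda_i^* \ge \hat\lambda_i - \tfrac12 \epsilon_\lambda \ge \tfrac12 \epsilon_\lambda, \quad \text{for all } i \in \mathcal B_+,$$

so that $\lambda^* \in S_\lambda^\nu$ for $\nu = 1/2$. We therefore have from (53), (54), and (56) that

$$\mathrm{dist}((z^0,\hat\lambda), \{z^*\} \times S_\lambda^\nu) = \|(z^0,\hat\lambda) - (z^*,\lambda^*)\| \le \beta'\bar\delta^\tau \le \hat\delta. \tag{57}$$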

From here on, we set $\lambda^0 \leftarrow \hat\lambda$, as in Algorithm sSQPa. Because of the last
bound, we can apply Theorem 6 to $(z^0,\lambda^0)$. We use this result to prove the
following claims. First,

$$\bar\delta \ge \delta(z^0,\lambda^0) \ge 2\delta(z^1,\lambda^1) \ge 4\delta(z^2,\lambda^2) \ge \cdots. \tag{58}$$

Second,

$$\eta(z^{k+1},\lambda^{k+1}) \le \eta(z^k,\lambda^k)^{1+\sigma/2}, \quad \text{for all } k = 0, 1, 2, \dots. \tag{59}$$

We prove both claims by induction. For $k = 0$ in (58), the bound on $\delta(z^0,\lambda^0)$
follows from (57) and the choice of $\bar\delta$ in (53). Assume that the first $k + 1$
inequalities in (58) have been verified. From (43a) and (53), we have that

$$\delta(z^{k+1},\lambda^{k+1}) \le \beta\,\delta(z^k,\lambda^k)^{1+\sigma} \le \beta\bar\delta^\sigma\,\delta(z^k,\lambda^k) \le \tfrac12\,\delta(z^k,\lambda^k),$$

so that the next inequality in the chain is also satisfied. For (59), we have from
Theorem |, (43a), and (58) that

$$\begin{aligned}
\eta(z^{k+1},\lambda^{k+1}) &\le \kappa_1\,\delta(z^{k+1},\lambda^{k+1}) \\
&\le \beta\kappa_1\,\delta(z^k,\lambda^k)^{1+\sigma} \\
&\le \beta\kappa_1\bar\delta^{\sigma/2}\,\delta(z^k,\lambda^k)^{1+\sigma/2} \\
&\le \beta\kappa_1\bar\delta^{\sigma/2}\kappa_0^{-1-\sigma/2}\,\eta(z^k,\lambda^k)^{1+\sigma/2} \\
&\le \eta(z^k,\lambda^k)^{1+\sigma/2},
\end{aligned}$$

where the last bound follows from (53). Hence, (59) is verified, so that the
condition in the "if" statement of Algorithm sSQPa is satisfied for all $k =
0, 1, 2, \dots$. Superlinear convergence with Q-order $1 + \sigma$ follows from (43a).
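
To see what Q-order $1 + \sigma$ looks like numerically, the following sketch generates a model error sequence that satisfies the recurrence $\delta_{k+1} = \beta\,\delta_k^{1+\sigma}$ with equality, then estimates the order from three consecutive terms. The function names and the sample values of $\beta$, $\sigma$, and $\delta_0$ are illustrative assumptions, not quantities taken from the paper. The theorem gives only an upper bound, so a real run of the algorithm need not follow this model exactly.

```python
import math

def model_sequence(delta0, beta, sigma, n):
    """Return delta_0..delta_n with delta_{k+1} = beta * delta_k**(1 + sigma)."""
    seq = [delta0]
    for _ in range(n):
        seq.append(beta * seq[-1] ** (1.0 + sigma))
    return seq

def q_order_estimates(errors):
    """Estimate the Q-order from each triple of consecutive positive errors.

    For e_{k+1} = C * e_k**p the ratio log(e_{k+2}/e_{k+1}) / log(e_{k+1}/e_k)
    equals p, because the constant C cancels.
    """
    return [math.log(e2 / e1) / math.log(e1 / e0)
            for e0, e1, e2 in zip(errors, errors[1:], errors[2:])]

# Illustrative values only: beta * delta0**sigma < 1, so the sequence decreases.
errors = model_sequence(delta0=0.1, beta=2.0, sigma=0.5, n=5)
orders = q_order_estimates(errors)
```

The three-term ratio is used because it does not require knowing the constant $\beta$. Applied to residuals from an actual run, it gives a quick empirical check of the observed order.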
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6. Summary and Possible Extensions 

We have presented a technique for identifying the active inequality constraints
at a local solution of a nonlinear programming problem, where the standard
assumptions (existence of a strictly complementary solution and linear
independence of the active constraint gradients) are replaced by weaker assumptions.
We have embedded this technique in a stabilized SQP algorithm, resulting in a
method that converges superlinearly under the weaker assumptions when started
at a point sufficiently close to the (primal-dual) optimal set.

The primal-dual algorithm described by Vicente and Wright Jh| can also be
improved by using the techniques outlined here. In that paper, strict
complementarity is assumed along with MFCQ, and superlinear convergence is proved
provided that both $\delta(z^0,\lambda^0)$ is sufficiently small and $\lambda_i^0 \ge \gamma$, for all
$i \in \mathcal B = \mathcal B_+$ and some $\gamma > 0$. If we apply the active constraint detection
procedure (17) and the subproblem (24) to any initial point $(z^0,\lambda^0)$ with
$\delta(z^0,\lambda^0)$ sufficiently small, the same convergence result can be obtained
without making the positivity assumption on the components of $\lambda^0_{\mathcal B_+}$.
(Because of the strict complementarity assumption, Procedure IDO serves only to
verify that $\mathcal B = \mathcal B_+$.)

Numerous issues remain to be investigated. We believe that degeneracy is
an important issue, given the large size of many modern applications of
nonlinear programming and their nature as discretizations of continuous problems.
Nevertheless, the practical usefulness of constraint identification and
stabilization techniques remains to be investigated. The numerical implications should
also be investigated, since implementation of these techniques may require
solution of ill-conditioned systems of linear equations (see M. H. Wright [15] and
S. J. Wright). Embedding of these techniques into globally convergent
algorithmic frameworks needs to be examined. We should investigate generalization
to equality constraints, possibly involving the use of the "weak" MFCQ
condition, which does not require linear independence of the equality constraint
gradients.
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A. Hager's Theorem 

We restate Theorem 1 of Hager Q, making a slight correction to the original
statement concerning the conditions on $(z^0,\lambda^0)$ and the radius of the
neighborhood containing the sequence $\{(z^k,\lambda^k)\}$. No modification to Hager's analysis is
needed to prove the following version of this result.

Theorem 8. Suppose that $z^*$ is a local solution of (1), and that $\phi$ and $g$ are
twice Lipschitz continuously differentiable in a neighborhood of $z^*$. Let $\lambda^*$ be
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some multiplier such that the KKT conditions (Qj are satisfied, and define 

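$$\mathcal B \stackrel{\rm def}{=} \{ i \mid \lambda_i^* > 0 \}.$$

Suppose that there is an $\alpha > 0$ such that

$$w^T \mathcal L_{zz}(z^*,\lambda^*)\, w \ge \alpha \|w\|^2, \quad \text{for all } w \text{ such that } \nabla g_i(z^*)^T w = 0, \text{ for all } i \in \mathcal B.$$

Then for any choice of $\sigma_0$ sufficiently large, there are positive constants $\rho_0$, $\sigma_1$,
$\gamma \ge 1$, and $\bar\beta$ such that $\sigma_0\rho_0 \le \sigma_1$, with the following property: For any $(z^0,\lambda^0)$
with

$$\|(z^0,\lambda^0) - (z^*,\lambda^*)\| \le \rho_0,$$

we can generate an iteration sequence $\{(z^k,\lambda^k)\}$, $k = 0, 1, 2, \dots$, by setting

$$(z^{k+1},\lambda^{k+1}) = (z^k + \Delta z, \lambda^+),$$

where, at iteration $k$, $(\Delta z, \lambda^+)$ is the local solution of the sSQP subproblem with

$$(z,\lambda) = (z^k,\lambda^k), \quad \mu = \mu_k \in [\sigma_0 \|z^k - z^*\|, \sigma_1],$$

that satisfies

$$\|(z^k + \Delta z, \lambda^+) - (z^*,\lambda^*)\| \le \gamma\,\|(z^0,\lambda^0) - (z^*,\lambda^*)\|.$$

Moreover, we have

$$\delta(z^{k+1},\lambda^{k+1}) \le \bar\beta\,\bigl[\delta(z^k,\lambda^k)^2 + \mu_k\,\delta(\lambda^k)\bigr].$$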
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